Modeling ETL activities as graphs

نویسندگان

  • Panos Vassiliadis
  • Alkis Simitsis
  • Spiros Skiadopoulos
چکیده

Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. In this paper, we focus on the logical design of the ETL scenario of a data warehouse. Based on a formal logical model that includes the data stores, activities and their constituent parts, we model an ETL scenario as a graph, which we call the Architecture Graph. We model all the aforementioned entities as nodes and four different kinds of relationships (instance-of, part-of, regulator and provider relationships) as edges. In addition, we provide simple graph transformations that reduce the complexity of the graph. Finally, in order to support the engineering of the design and the evolution of the warehouse, we introduce specific importance metrics, namely dependence and responsibility, to measure the degree to which entities are bound to each other.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Graph-Based Modeling of ETL Activities with Multi-level Transformations and Updates

Extract-Transform-Load (ETL) workflows are data centric workflows responsible for transferring, cleaning, and loading data from their respective sources to the warehouse. In this paper, we build upon existing graph-based modeling techniques that treat ETL workflows as graphs by (a) extending the activity semantics to incorporate negation, aggregation and selfjoins, (b) complementing querying se...

متن کامل

The Conceptual Modeling of Etl Processes

An ETL process includes various ETL activities, such as filtering, aggregating, checking for null values, etc., which can be represented by the constraint functions and transforming operations defined in previous section. However, the activities cannot exist in an ETL process independently; they must be organized in certain order that is specified in an ETL task of the ETL process. We think tha...

متن کامل

Modeling and managing ETL processes

Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. The design, development and deployment of ETL processes, which is currently, performed in an ad-hoc, in house fashion, needs modeling, design and methodological foundations. Unfortunately, the resear...

متن کامل

Quality measures for ETL processes: from goals to implementation

ETL processes play an increasingly important role for the support of modern business operations. These business processes are centred around artifacts with high variability and diverse lifecycles, which correspond to key business entities. The apparent complexity of these activities has been examined through the prism of Business Process Management, mainly focusing on functional requirements an...

متن کامل

A Framework for ETL Systems Development

There are many commercial Extract-Transform-Load (ETL) tools, of which most of them do not offer an integrated platform for modeling processes and extending functionality. This drawback complicates the customization and integration with other applications, and consequently, many companies adopt internal development of their ETL systems. A possible solution is to create a framework to provide ex...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002